Contextually-Mediated Semantic Similarity Graphs for Topic Segmentation

Authors

  • Geetu Ambwani
  • Anthony Davis
Abstract

We present a representation of documents as directed, weighted graphs that model both the range of influence of terms within the document and the contextually determined semantic relatedness among terms. We then show the usefulness of this kind of representation in topic segmentation. Our boundary detection algorithm uses this graph to determine topical coherence and potential topic shifts, and requires neither labeled data nor training of parameters. We show that this method yields improved results both on concatenated pseudo-documents and on closed captions for television programs.
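
The abstract does not give the graph construction or the boundary criterion, so the following is only a minimal sketch of the general idea, under assumptions of our own: each term occurrence influences the next `reach` sentences with linearly decaying weight, relatedness comes from a hand-made toy table, and a boundary is placed wherever the total edge weight crossing a sentence gap is a strict local minimum. The names `build_graph`, `gap_coherence`, `boundaries`, `toy_relatedness`, and `reach` are hypothetical, and the "contextually mediated" aspect of relatedness is not modeled here.

```python
from collections import defaultdict

# Toy relatedness table (an assumption for illustration, not from the paper).
RELATED = {frozenset(("goal", "team")): 0.5}

def toy_relatedness(a, b):
    """1.0 for identical terms, a table value for listed pairs, else 0.0."""
    if a == b:
        return 1.0
    return RELATED.get(frozenset((a, b)), 0.0)

def build_graph(sentences, relatedness, reach=2):
    """Directed, weighted edges from earlier to later term occurrences.

    Nodes are (sentence_index, term). A term influences the next `reach`
    sentences; the linear decay with distance is an assumed choice.
    """
    edges = defaultdict(float)
    for i, src_terms in enumerate(sentences):
        for j in range(i + 1, min(i + reach + 1, len(sentences))):
            decay = 1.0 - (j - i - 1) / reach
            for s in src_terms:
                for t in sentences[j]:
                    w = relatedness(s, t) * decay
                    if w > 0:
                        edges[((i, s), (j, t))] += w
    return edges

def gap_coherence(edges, n_sentences):
    """Total weight of edges crossing each gap between adjacent sentences."""
    scores = [0.0] * (n_sentences - 1)
    for ((i, _), (j, _)), w in edges.items():
        for g in range(i, j):  # edge i -> j crosses gaps i .. j-1
            scores[g] += w
    return scores

def boundaries(scores):
    """Indices of gaps whose coherence is a strict local minimum."""
    found = []
    for g, s in enumerate(scores):
        left = scores[g - 1] if g > 0 else float("inf")
        right = scores[g + 1] if g + 1 < len(scores) else float("inf")
        if s < left and s < right:
            found.append(g)
    return found

# Two concatenated toy "documents": sports (sentences 0-2), weather (3-5).
sentences = [
    ["goal", "match"], ["team", "goal"], ["match", "team"],
    ["rain", "storm"], ["storm", "wind"], ["wind", "rain"],
]
edges = build_graph(sentences, toy_relatedness, reach=2)
scores = gap_coherence(edges, len(sentences))
print(boundaries(scores))
```

In this toy input no edge crosses the gap between sentences 2 and 3, so that gap is the only one reported. A real system would need a principled relatedness measure and a less brittle minimum criterion; both are left open here because the abstract does not specify them.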

Similar resources

Diachronic semantic cohesion for topic segmentation of TV broadcast news

This paper proposes a new way to integrate semantic relations into a topic segmentation process by defining the notion of semantic cohesion. In the context of a sliding-window-based automatic topic segmentation algorithm, semantic relations are incorporated into the similarity measure between adjacent blocks. Additionally, in the context of TV broadcast news topic segmentation, we propose a new prot...

Similarity for Natural Semantic Networks

A natural semantic network (NSN) represents the knowledge of a group of persons with respect to a particular topic. Comparing NSNs would make it possible to discover how close one group is to another in terms of expertise in the topic: for example, how close apprentices are to experts, or students to teachers. We propose to conceive natural semantic networks as weighted bipartite graphs and to extract fe...

An Orthonormal Basis for Topic Segmentation in Tutorial Dialogue

This paper explores the segmentation of tutorial dialogue into cohesive topics. A latent semantic space was created using conversations from human-to-human tutoring transcripts, allowing cohesion between utterances to be measured using vector similarity. Previous cohesion-based segmentation methods that focus on expository monologue are reapplied to these dialogues to create benchmarks for perfo...

Transcript Segmentation Using Utterance Cosine Similarity Measure

One of the problems addressed by the Tracker project is the extraction of the key issues discussed at meetings through the analysis of transcripts. Whilst topic extraction is an easy task for humans, it has proven difficult to automate given the unstructured nature of our transcripts. This paper proposes a new approach to transcript segmentation based on the Utterance Cosine Sim...

Automatic Hashtag Recommendation in Social Networking and Microblogging Platforms Using a Knowledge-Intensive Content-based Approach

In social networking and microblogging environments, a #tag is often used for categorizing messages and marking their key points. Also, since some social networks such as Twitter restrict the number of characters in a message, #tags can serve as a useful tool for helping users express their messages. In this paper, a new knowledge-intensive content-based #tag recommendation system is intr...

Publication date: 2010